Completion of the Dictionary

6/23/2021


The Dictionary portion of Pysistant has been completed. The problems I ran into, and how I solved them, are covered below.

Problems and Solutions

I have added a Dictionary to Pysistant by using a combination of the Requests library, Beautiful Soup 4 and Regex to scrape definitions from WordNet.

Implementing the Dictionary proved to be a simple programming task. Most of the issues I ran into were of my own creation and came from two places: the sheer number of features in Beautiful Soup 4, and the search for a website that was simple enough to scrape but still had the definitions I wanted.

The number of features that Beautiful Soup 4 supports had me digging through the documentation to work out how I should approach scraping websites for information. After a few days of reading the documentation and watching videos on the library, I learned about the find_all method and how to loop over its results to return the information I wanted to pull from a site. But I still needed a website to put this theory into practice.
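To make the find_all-and-loop idea concrete, here is a minimal sketch. It is not the actual Pysistant code: the sample HTML, the choice of the li tag, and the function name extract_list_items are all assumptions made up for illustration, and the markup is not copied from any real site.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
# This is an invented example, not WordNet's actual markup.
SAMPLE_HTML = """
<html><body>
  <h3>Noun</h3>
  <ul>
    <li>S: (n) dog, domestic dog (a member of the genus Canis)</li>
  </ul>
  <h3>Verb</h3>
  <ul>
    <li>S: (v) chase, dog (go after with the intent to catch)</li>
  </ul>
</body></html>
"""


def extract_list_items(html):
    """Return the text of every <li> element found in the given HTML."""
    # "html.parser" is the parser bundled with Python, so no extra install is needed.
    soup = BeautifulSoup(html, "html.parser")
    items = []
    # find_all returns every matching tag; the loop pulls the text out of each one.
    for li in soup.find_all("li"):
        items.append(li.get_text(strip=True))
    return items


if __name__ == "__main__":
    for entry in extract_list_items(SAMPLE_HTML):
        print(entry)
```

In a real scraper the HTML string would come from a Requests response rather than a constant, and the tag passed to find_all would depend on how the target page is laid out.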

Finding a website that was simple enough to scrape but also had the definitions I needed proved to be a difficult and unique challenge. I immediately looked to larger sites like Merriam-Webster and Dictionary.com, but their layouts proved to be either too complex or too bloated to scrape efficiently. I looked further and eventually found WordNet. With its simple design and even simpler HTML, I could finally get to work on scraping definitions from the site.

With a bit of Regex, I was able to scrape WordNet for definitions of words based on user input. Most of the challenge in this project proved to be outside the IDE, and it led me to a resource whose minimalist design triumphed over its more convoluted counterparts.
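As a rough illustration of the Regex step, the sketch below pulls the part of speech, the words, and the definition out of one line of scraped text. The line format, the pattern, and the function name parse_entry are assumptions made for this example; the pattern Pysistant actually uses may look quite different.

```python
import re

# Assumed line format, for illustration only: "S: (pos) word, word (definition)"
ENTRY_PATTERN = re.compile(
    r"^S:\s*\((?P<pos>[a-z]+)\)\s*"   # part of speech in parentheses, e.g. (n)
    r"(?P<words>[^()]+?)\s*"          # comma-separated words before the definition
    r"\((?P<gloss>.+)\)\s*$"          # definition inside the final parentheses
)


def parse_entry(line):
    """Split one scraped line into its parts, or return None if it does not match."""
    match = ENTRY_PATTERN.match(line.strip())
    if match is None:
        return None
    return {
        "pos": match.group("pos"),
        "words": [word.strip() for word in match.group("words").split(",")],
        "definition": match.group("gloss"),
    }


if __name__ == "__main__":
    print(parse_entry("S: (n) dog, domestic dog (a member of the genus Canis)"))
```

Returning None for lines that do not match keeps the caller in control of what to do with headings or other text that is not a definition.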

Follow the development of Pysistant here or on GitHub.
This blog post is tagged: Pysistant.